-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
external_endpoints should support more than 200 silos and TLS certificates #7291
Conversation
34bfee1
to
f138943
Compare
This reverts commit bafa224.
let batch = datastore | ||
.dns_zones_list(opctx, DnsGroup::External, &p.current_pagparams()) | ||
.await?; | ||
paginator = p.found_batch(&batch, &|z: &DnsZone| z.zone_name.clone()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all looks correct to me, but out of curiosity: why in the other two queries do we have an explicit PaginatedBy::Id(pageparams)
, and for this one we seem to be implicitly paginating by zone name? Is the use of z.zone_name
here relying on implementation details of dns_zones_list()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I recall correctly: the sort of low-level parameter for pagination is DataPageParams
, which can be parametrized by basically any marker type:
omicron/common/src/api/external/mod.rs
Lines 95 to 120 in 460f038
/// Parameters used to request a specific page of results when listing a | |
/// collection of objects | |
/// | |
/// This is logically analogous to Dropshot's `PageSelector` (plus the limit from | |
/// Dropshot's `PaginationParams). However, this type is HTTP-agnostic. More | |
/// importantly, by the time this struct is generated, we know the type of the | |
/// sort field and we can specialize `DataPageParams` to that type. This makes | |
/// it considerably simpler to implement the backend for most of our paginated | |
/// APIs. | |
/// | |
/// `NameType` is the type of the field used to sort the returned values and it's | |
/// usually `Name`. | |
#[derive(Clone, Debug)] | |
pub struct DataPageParams<'a, NameType> { | |
/// If present, this is the value of the sort field for the last object seen | |
pub marker: Option<&'a NameType>, | |
/// Whether the sort is in ascending order | |
pub direction: PaginationOrder, | |
/// This identifies how many results should be returned on this page. | |
/// Backend implementations must provide this many results unless we're at | |
/// the end of the scan. Dropshot assumes that if we provide fewer results | |
/// than this number, then we're done with the scan. | |
pub limit: NonZeroU32, | |
} |
For collections that only support pagination by one thing, like dns_zones_list()
, their datastore functions just accept DataPageParams
:
pagparams: &DataPageParams<'_, String>, |
For relatively few collections we support paginating by any one of a few different fields (well, it's always "id" or "name"). The caller gets to pick. The code is basically the same and these functions accept a PaginatedBy
, which is just an enum with variants for by-id or by-name, and each variant just contains the appropriate DataPageParams
type:
omicron/common/src/api/external/http_pagination.rs
Lines 364 to 368 in 460f038
#[derive(Debug)] | |
pub enum PaginatedBy<'a> { | |
Id(DataPageParams<'a, Uuid>), | |
Name(DataPageParams<'a, Name>), | |
} |
So I don't think this is relying on an implementation detail. It's just that dns_zones_list()
only supports paginating by the zone name so it accepts DataPageParams
directly, while the other functions support pagination by either and so need the wrapper that lets you pick which one you want (but ultimately you're still providing a DataPageParams
and that's what the code is using).
There's a bit of an implicit detail here which is that the caller of a paginated function needs to know which field of the object is the marker. (In practice this is not super easy to get wrong because the marker type is usually kind of unique -- a uuid or a Name -- though you could pick a different field that's also a uuid or something.) This applies to all three queries though and every other use of Paginator
, too (it's encoded in the closure provided to found_batch()
).
Fixes #7289.
In terms of testing, there are three commits here:
I have run the test twice. With the fix, the test passed. Without the fix, the test failed with:
The key bit is:
That's what we'd expect given the previous behavior: some silos won't have their certificates loaded. I can also see from the log file that we hit the error case from the original issue:
and I can also see these messages:
That's all consistent with what we'd expect if we'd simply not loaded this silo's certificate, which is what we'd expect if we'd hit this limit prior to this fix.
I suspected this test might take a while, and indeed the output shows it took 143s to do this extra work (in the success case). I think this is too expensive to bother including so I've reverted the automated test. It really shouldn't be necessary since
Paginator
is extensively tested elsewhere. I just wanted to be sure I was resolving the original problem.